Skip to content

fix(tesseract): JOIN-model assembly for multi-stage reduce_by#10947

Merged
waralexrom merged 11 commits into
masterfrom
tesseract-multi-stage-count-distinct-fix
May 27, 2026
Merged

fix(tesseract): JOIN-model assembly for multi-stage reduce_by#10947
waralexrom merged 11 commits into
masterfrom
tesseract-multi-stage-count-distinct-fix

Conversation

@waralexrom
Copy link
Copy Markdown
Member

Summary

Multi-stage measures with reduce_by previously routed through a window
function — for non-additive base measures (e.g. count_distinct) this
emitted invalid SQL like countDistinct(countDistinct(x)) OVER (...).
This PR routes every Aggregate inode through a JOIN-based assembly by
default, preserves the window-based render as an opt-in optimisation for
the narrow safe subset (single dep, additive identity rollup, no
leaf-extending modifiers), and adds integration coverage for the
previously-unspecified shapes.

Changes

  • Aggregate inodes assemble via FullKeyAggregate + LEFT JOIN against
    a leaf-grain dim grid (new keys_input slot on the multi-stage logical
    plan). Modifiers apply in order: reduce_by/group_byadd_group_by
    time_shift (the swap was the root cause of the count_distinct
    sum-by-quarter regression).
  • Window-based render kept for sum-over-{sum, count} aggregates with
    no add_group_by, when the resulting partition is a strict subset of
    query dims (otherwise the window degenerates into a plain group-by and
    we skip it).
  • FullKeyAggregate keys-side handling unified: a single strategy
    handles both explicit keys_input (JOIN-model) and keys-derived-from-
    measure-refs (when the dialect lacks FULL JOIN).
  • Postgres integration suite: covers reduce_by with avg,
    count_distinct, max(sum), sum(max), sum(count), add_group_by
    combo, and a Calculate parent over mixed children; each test asserts
    which path the planner picked (window vs JOIN).
  • Test infra: Postgres test containers now clean up volumes on atexit.

Testing

  • cargo test --features integration-postgres multi_stage → 108 passed,
    6 ignored (unrelated).
  • cargo test --features integration-postgres reduce_by → 18 passed.
  • New tests pin expected aggregated values for the JOIN-model shapes on
    the integration seed.

…reduce_by

Add three ignored multi-stage integration tests covering count_distinct,
avg, and count outer aggregations with reduce_by. Their snapshots hold the
honest expected values so when the window-based path is replaced with the
JOIN-based model these tests turn green; until then they document the
silent-wrong / invalid-SQL bugs by name.
Extend the reduce_by integration suite with measures and tests that probe
the full grain matrix: sum-of-count and add_group_by+reduce_by combo are
numerically correct under the current window path and pass; max-of-sum,
sum-of-max stay ignored with snapshots pinned to JOIN-semantic results so
the assertions stop documenting the broken window shape.
Add FullKeyAggregateKeysInput { refs, keys } structure on FullKeyAggregate
and a parallel keys_input vec on MultiStageQueryDescription. The planner
now populates the description's keys_input when reduce_by / group_by
actually shrinks parent's state grain (add_group_by alone never triggers),
detected via a partition_new_state computed alongside new_state. Inputs
keep their existing full-grain shape and the renderer is untouched, so
this is purely metadata for the upcoming JOIN-based assembly.
When a multi-stage Aggregate inode shrinks parent's state grain via
reduce_by / group_by and no add_group_by is in play, route the CTE
through a JOIN-based path instead of the window function. The planner
keeps the original full-grain inputs as keys_input and runs a fresh
make_childs round at partition grain for the measure-side; member
planner threads both sides into a FullKeyAggregateKeysInput and skips
the Window flag. A new KeysInputFullKeyAggregateStrategy renders the
result as LEFT JOIN of measure-side refs against the keys-side ref,
JOIN-keys derived from schema intersection.

Combo cases (add_group_by + reduce_by) and Rank still go through the
legacy window path; non-idempotent outer types (count, count_distinct)
remain ignored until measure references can be threaded through
add_ungrouped_measure_reference.

reduce_by suite: 12 passing, 2 ignored. avg / max_of_sum / sum_of_max
now converge to the JOIN-semantic results.
…m-of-…

Outer multi-stage type for rolling up count / count_distinct base measures
to a coarser grain is `sum` — same rollup rule the pre-aggregation path
uses (Count → sum via pre_aggregate_wrap). The earlier `type: count` /
`type: count_distinct` shapes were the wrong user-facing pattern: outer
`count(...)` on a single broadcasted row collapses to 1.

Switch both ignored tests to `type: sum` with `sql: "{CUBE.count}"` /
`sql: "{CUBE.unique_customers}"`. Snapshots pin the JOIN-semantic results
(3 / 6 / 6 total orders per status, 3 / 3 / 3 distinct customers per
status, broadcast across categories). reduce_by suite: 14 passing, 0
ignored.
…ategy

Fold the JOIN-model assembly into the existing `KeysFullKeyAggregateStrategy`
so one code path handles both cases:
  - keys-side from the logical plan's `keys_input` (JOIN-model: measure
    refs live at partition grain, keys refs at leaf grain);
  - keys-side derived from measure refs themselves (legacy keys path used
    when the dialect lacks FULL JOIN).
JOIN-keys come from the intersection of the keys-side dim set with each
measure ref's schema, which collapses to partition grain in JOIN-model and
to full grain in the legacy path. The separate `KeysInputFullKeyAggregateStrategy`
is removed.

`FullKeyAggregateKeysInput` drops the redundant `keys` field — the strategy
reads key dims off `refs[0].schema()` directly. Add `test_reduce_by_calculated_over_two`
covering a Calculate inode that sums two reduce_by Aggregate children.
reduce_by suite: 15 passing.
Pass `-v` to `docker rm -f` in the atexit-cleanup callback so anonymous
volumes attached to each test Postgres container are removed alongside
the container itself. Without it disk usage steadily grows across test
runs.
Drop the legacy Window path for Aggregate inodes — `window_function_to_use`
is now always `None` for them (Rank still uses Rank). The unused branch
stays as a comment for context.

In the planner: apply modifiers in the right order — first reduce_by /
group_by (shrink parent grain), then add_group_by (extend leaf). The
previous order silently dropped `add_group_by`-added dims when
`group_by` was a keep-only filter, breaking the leaf grain for cases
like multi-stage count_distinct sum-by-quarter (the pre-aggregation
match relied on the leaf staying at full leaf grain).

With the corrected order `add_group_by + reduce_by` combo lands on the
JOIN-model: leaf at `state - reduce_by + add_group_by`, keys_input
rebuilt at parent state via a second make_childs. Combo numerical case
(test_reduce_by_add_group_by_combo) now matches its honest expected
result through the JOIN path, without the window fallback.

Suite: 925 passing, 11 ignored (none reduce_by-related).
- assert in reduce_by tests which path (window vs JOIN-model) is taken
- simplify is_window_path_eligible: single-slice dep pattern + Result-typed
  as_measure(), no expect()
- drop branch-history language ("legacy", "now", "see <case>") from
  multi-stage docs and inline comments
@github-actions github-actions Bot added the rust Pull requests that update Rust code label May 27, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 27, 2026

Claude finished @waralexrom's task in 2m 37s —— View job


Code Review: fix(tesseract): JOIN-model assembly for multi-stage reduce_by

  • Gather context and read all changed files
  • Review core logic changes (multi-stage planner, member query planner)
  • Review FullKeyAggregate strategy changes
  • Review logical plan changes
  • Review test coverage and snapshot correctness
  • Review test infrastructure changes
  • Compile findings and post review

Overall Assessment

This is a well-structured fix for a correctness bug — countDistinct(countDistinct(x)) OVER (...) is clearly invalid SQL, and the new JOIN-model assembly is the right approach. The window-path is correctly preserved as an optimisation for the narrow safe subset (sum over {sum, count}). The code is well-commented, the modifier ordering is sound (reduce_by/group_byadd_group_bytime_shift), and the integration test matrix covers the important shapes.

Findings

1. Duplicate measure definitions in YAML fixture (Minor)

count_reduce_category and count_sum_reduce_category in integration_multi_stage.yaml:111-130 have identical definitions:

- name: count_reduce_category       # line 111
  type: sum
  sql: "{CUBE.count}"
  multi_stage: true
  reduce_by:
      - orders.category

- name: count_sum_reduce_category    # line 125
  type: sum
  sql: "{CUBE.count}"
  multi_stage: true
  reduce_by:
      - orders.category

Both are sum over count with the same reduce_by. Consider removing one and renaming the remaining one to clarify its intent (e.g. sum_of_count_reduce_category), or giving them different semantics.

Fix this →

2. Missing path assertion in test_reduce_by_sum_of_count (Minor)

reduce_by.rs:104 — every other new test calls either assert_uses_window(&sql) or assert_no_window(&sql) to pin which planning path was chosen. This test just calls ctx.build_sql(query).unwrap() without the assertion. Given the test comment says "Numerically correct under the current window path", it should assert that:

let sql = ctx.build_sql(query).unwrap();
assert_uses_window(&sql);  // sum-of-count is window-eligible

This would make the test fail-fast if the planner accidentally changes paths for this shape.

Fix this →

3. key_dims derived from first keys ref only — assumes uniform schemas (Low risk)

keys_aggregate_strategy.rs:66-67:

let key_dims: Vec<Rc<MemberSymbol>> = if has_explicit_keys && !keys_refs.is_empty() {
    keys_refs[0].schema().all_dimensions().cloned().collect()

This takes the key dimensions from the first keys ref. If different keys refs could carry different dimension sets, this would silently drop dims. Currently the planner builds all keys_input refs from the same state, so they should always be uniform — but it's fragile if that invariant changes. A defensive check or a doc-comment anchoring the invariant would help future maintainers.

4. partition_filter mirrors member_partition_by_logical — potential drift (Low risk)

multi_stage_query_planner.rs:258-280 documents itself as a "Mirror of MultiStageMemberQueryPlanner::member_partition_by_logical." When two functions must stay in sync, bugs can arise from one being updated without the other. Consider extracting the shared logic into a common utility, or at minimum linking them with a cross-reference comment in member_partition_by_logical as well, so an edit to either function reminds the developer to update the other.

5. make_childs called twice for non-window Aggregate inodes (Performance consideration)

multi_stage_query_planner.rs:538-579 — for the JOIN-model path, make_childs is called once at partition grain (for input) and once at parent grain (for keys_input). This is correct since the two calls build descriptions at different granularities. However, the second call passes the same descriptions and resolved_multi_stage_dimensions mutables, and identical leaf deps may be deduped by the existing CTE-state machinery. Worth confirming that double-calling doesn't create redundant CTE entries in pathological cases (e.g. when partition grain == parent grain but any_missing still evaluates true due to reference-chain differences).

6. Null-safe join for the explicit-keys path (Correctness note)

keys_aggregate_strategy.rs:182-185:

JoinCondition::new_dimension_join(conditions, !has_explicit_keys),

The comment says null-safety is "unnecessary but stays correct" for the explicit-keys path. This is sound — !has_explicit_keys correctly disables null-safe joins for the JOIN-model. Good decision to keep it simple.

Positive highlights

  • Well-scoped optimization guard: is_window_path_eligible is precise — exactly sum over {sum, count} with a single dep. The associativity reasoning is correct and well-documented.
  • Modifier ordering: The reduce_by/group_byadd_group_bytime_shift ordering is well-commented and the root cause (swapped order causing the count_distinct regression) is clearly identified.
  • Test matrix: Excellent coverage of measure combinations — avg, count_distinct, max(sum), sum(max), sum(count), add_group_by combo, and Calculate over mixed children. Each test pins the expected planning path (window vs JOIN), which is a great regression guard.
  • Clean separation: The FullKeyAggregateKeysInput abstraction cleanly separates the "where do keys come from" concern from the join-building logic in KeysFullKeyAggregateStrategy.
  • Docker cleanup (-v flag): Small but welcome improvement to prevent volume leak in test containers.

Summary

The core fix is correct and well-tested. The main actionable items are:

  1. Add the missing assert_uses_window in test_reduce_by_sum_of_count for consistency
  2. Consider removing the duplicate YAML measure definition
  3. Consider adding an invariant assertion or doc-comment for the key_dims derivation from keys_refs[0]

The rest are low-risk observations for future maintainability. The PR is in good shape overall.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.57%. Comparing base (cfe7f38) to head (de52ffb).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10947      +/-   ##
==========================================
+ Coverage   78.98%   83.57%   +4.59%     
==========================================
  Files         470      254     -216     
  Lines       93132    76126   -17006     
  Branches     3454        0    -3454     
==========================================
- Hits        73559    63623    -9936     
+ Misses      19070    12503    -6567     
+ Partials      503        0     -503     
Flag Coverage Δ
cube-backend ?
cubesql 83.57% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

- drop duplicate `count_sum_reduce_category` measure (identical to
  `count_reduce_category`) and the redundant `test_reduce_by_sum_of_count`
  it covered — `test_reduce_by_count` already exercises sum-of-count with
  the window-path assertion
- add FIXME on `partition_filter` pointing to its sibling
  `member_partition_by_logical` — the two reshape the same way on
  different inputs and should be merged
@waralexrom waralexrom closed this May 27, 2026
@waralexrom waralexrom reopened this May 27, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 27, 2026

Claude finished @waralexrom's task in 2m 35s —— View job


Code Review: fix(tesseract): JOIN-model assembly for multi-stage reduce_by

  • Gather context and read all changed files
  • Review core planner logic (multi_stage_query_planner.rs)
  • Review member query planner changes (member_query_planner.rs, member.rs)
  • Review FullKeyAggregate strategy and logical plan changes
  • Review query description changes
  • Review test coverage, fixtures, and snapshots
  • Review test infrastructure changes
  • Compile findings and post review

Overall Assessment

This is a well-executed correctness fix for a real bug — countDistinct(countDistinct(x)) OVER (...) is invalid SQL, and routing non-additive aggregates through a JOIN-based assembly is the right architectural choice. The previous review's actionable items (duplicate YAML measure, missing path assertion, FIXME on partition_filter) have all been addressed in 794d5e0.

The code is well-structured: the window-path optimization is preserved for the narrow safe subset, the modifier ordering is correct, and the test matrix covers the important shapes. I have a few observations, mostly low-risk and for future consideration.


Findings

1. key_dims derived from first keys ref assumes uniform schemas (Low risk)

keys_aggregate_strategy.rs:66-67:

let key_dims: Vec<Rc<MemberSymbol>> = if has_explicit_keys && !keys_refs.is_empty() {
    keys_refs[0].schema().all_dimensions().cloned().collect()

This takes key dimensions from the first keys ref only. Currently safe because make_childs generates all keys_input refs from the same state, so they carry identical dimension schemas. However, this invariant is implicit — if make_childs ever produces refs at different grains (e.g. from a heterogeneous CASE-SWITCH dependency tree), this would silently drop dims from refs beyond the first.

A debug assertion or doc-comment anchoring the invariant would help:

debug_assert!(
    keys_refs.windows(2).all(|w| {
        let a: HashSet<_> = w[0].schema().all_dimensions().map(|d| d.full_name()).collect();
        let b: HashSet<_> = w[1].schema().all_dimensions().map(|d| d.full_name()).collect();
        a == b
    }),
    "keys_input refs must carry identical dimension schemas"
);

2. partition_filter duplication — FIXME is present, consider extracting sooner (Low risk)

multi_stage_query_planner.rs:258-287 — The FIXME is now in place (good), but the two copies (partition_filter and member_partition_by_logical) operate on different input types (&Vec<Rc<MemberSymbol>> vs the all_dimensions() chain), making a straightforward extraction non-trivial. Worth flagging for a follow-up so it doesn't stay a FIXME indefinitely.

3. make_childs called twice for non-window Aggregate inodes — dedup is sound

multi_stage_query_planner.rs:541-583 — For the JOIN-model path, make_childs is called once at partition grain (for input) and once at parent grain (for keys_input). The second call passes the same descriptions Vec, so the dedup in make_queries_descriptions (line 455-459) will reuse any already-built CTE whose (member, state) matches. When partition grain == parent grain, any_missing evaluates to false so keys_input stays empty — correct. I verified that no redundant CTEs are created in the equal-grain case.

4. Fast-path bypass changed for single-ref + explicit keys (Correctness note)

keys_aggregate_strategy.rs:120-123:

if data_queries.len() == 1 && !has_explicit_keys {

Previously, the fast path fired for data_queries.len() == 1 unconditionally, which would have skipped the JOIN when a single measure ref had explicit keys. The new guard (&& !has_explicit_keys) is correct — with explicit keys, even a single measure ref needs the JOIN to broadcast from partition grain to leaf grain.

5. Strategy routing in mod.rskeys_input.is_some() takes priority (Correct)

mod.rs:38-43 — The new keys_input().is_some() check is placed before the has_dimensions() and supports_full_join() checks. This is correct: when the planner has built explicit keys, the KeysFullKeyAggregateStrategy must be used regardless of the dialect's FULL JOIN support.

6. Null-safe join disabled for explicit-keys path (Correct)

keys_aggregate_strategy.rs:182-185:

JoinCondition::new_dimension_join(conditions, !has_explicit_keys),

With explicit keys, the keys-side is a DISTINCT projection from the leaf CTE which already has all possible dim values — null-safe join is unnecessary. With derived keys (no explicit keys), null-safety is needed for the FULL JOIN-equivalent semantics. The boolean is correctly inverted.


Positive Highlights

  • Well-scoped optimization guard: is_window_path_eligible at lines 233-256 is precise and well-documented. The sum over {sum, count} check with single-dep requirement is the correct mathematical subset for associative rollup.
  • Modifier ordering: The comment block at lines 499-510 clearly documents the reduce_by/group_byadd_group_bytime_shift ordering and why step 1 must precede step 2. This was the root cause of the regression and is now well-guarded.
  • Test matrix: 7 new integration tests covering avg, count_distinct, max(sum), sum(max), sum(count), add_group_by combo, and Calculate over mixed children. Each pins expected values via snapshots AND asserts the planning path (window vs JOIN) — excellent regression coverage.
  • Clean abstraction: FullKeyAggregateKeysInput separates the "where do keys come from" concern from the join-building logic without over-engineering.
  • Addressed previous review: All three actionable findings from the first review (duplicate YAML, missing assertion, FIXME comment) were handled in commit 794d5e0.
  • Docker cleanup (-v flag in pg_service.rs): Small but welcome fix preventing Docker volume leaks.

Summary

The core fix is correct, well-tested, and the architecture cleanly separates the window-optimisation path from the JOIN-model path. The main follow-up item is the partition_filter / member_partition_by_logical duplication (already FIXME'd). No blocking issues found.

@waralexrom waralexrom merged commit eab34f2 into master May 27, 2026
224 of 233 checks passed
@waralexrom waralexrom deleted the tesseract-multi-stage-count-distinct-fix branch May 27, 2026 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants